A statistical phonemic segment model for speech recognition based on automatic phonemic segmentation
Abstract
This paper presents a method for constructing a statistical phonemic segment model (SPSM) for a speech recognition system based on speaker-independent, context-independent automatic phonemic segmentation. In our earlier research, we proposed a phoneme recognition system that applies template matching to the same segmentation, and confirmed that a fixed-length time sequence of 5 frames of feature vectors, used as a template, represents the features of a phoneme effectively. Here, to refine this mass of templates into a smarter model, we introduce a statistical method into the modeling. An SPSM connects 5 Gaussian N-mixture density distributions in series. In a closed Japanese spoken-word recognition experiment using 4920 VCV-balanced words spoken by 10 adult male speakers, comprising 34430 phonemes in total, the phoneme recognition rate reached 90.23 % with SPSM, compared with 80.39 % with phoneme templates.
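The series structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal covariances, assumes the 5 frames are scored independently by position-specific mixtures, and uses hypothetical names (`gmm_logpdf`, `segment_loglik`, `classify`). The abstract does not specify training, the feature type, or the value of N.

```python
import math

def gmm_logpdf(frame, weights, means, variances):
    """Log density of one feature vector under a diagonal-covariance
    Gaussian mixture (an assumption; the abstract does not state the
    covariance structure)."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log of a diagonal Gaussian, summed over feature dimensions.
        log_gauss = -0.5 * sum(
            (x - m) ** 2 / v + math.log(2.0 * math.pi * v)
            for x, m, v in zip(frame, mu, var)
        )
        log_terms.append(math.log(w) + log_gauss)
    # Log-sum-exp over the N mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

def segment_loglik(segment, model):
    """Score a 5-frame segment against 5 mixtures connected in series:
    frame k is evaluated only by the k-th mixture (assumed independent)."""
    if len(segment) != 5 or len(model) != 5:
        raise ValueError("expected 5 frames and 5 mixture densities")
    return sum(gmm_logpdf(frame, *gmm) for frame, gmm in zip(segment, model))

def classify(segment, models):
    """Return the phoneme label whose model gives the highest score."""
    return max(models, key=lambda label: segment_loglik(segment, models[label]))
```

Here `models` would map each phoneme label to a list of 5 `(weights, means, variances)` tuples, one per frame position, and a segmented 5-frame sequence is assigned to the label with the highest total log-likelihood.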
Similar resources
Integrating Statistical and Knowledge-based Methods for Automatic Phonemic Segmentation
This thesis presents a prototype system, which integrates statistical and knowledge-based methods, for automatic phonemic segmentation of speech utterances for use in speech production research. First, Aligner, a commercial speech alignment software, synchronizes the speech waveform to the provided text, using hidden Markov models that were trained on phones. Then, a custom built knowledge-based...
Statistical analysis of orthographic and phonemic language corpus for word-based and phoneme-based Polish language modelling
This article presents the original results of Polish language statistical analysis, based on the orthographic and phonemic language corpus. Phonemic language corpus for Polish was developed by using automatic grapheme-to-phoneme conversion of the source orthographic language corpus, obtained from the National Corpus of Polish (NCP). The corpus contains the most frequently used Polish words, wri...
A knowledge-based nasal classifier for use in continuous speech recognition
In a phoneme-based speaker-adaptive automatic recognition system for continuous English speech, a segmentation algorithm for nasals uses automatically derived thresholds on spectral energy measures. A Gaussian classifier using formant information with duration, two compound measures, and a 'spectral contrast' measure, is applied to the hypothesised nasal segments, which are classified phonemic...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Automatic Segmentation of Greek Speech Signals to Broad Phonemic Classes
In this paper, we evaluate an implicit approach for the automatic detection of broad phonemic class boundaries of continuous speech signals. The reported method consists of a prior segmentation of the speech signal into pitch-synchronous segments, using pitchmark locations, for the computation of adjacent broad phonemic class boundaries. The approach’s validity was tested on a phonetically ri...
Publication date: 1998